Hereby declare - 

(a) That we are in possession of an invention titled 

"NOVEL ALGORITHM FOR LOSSLESS DATA COMPRESSION" 

(b) That the Provisional Specification relating to this invention is filed 
with this application. 

(c) That there is no lawful ground of objection to the grant of a patent to us. 




Further declare that the inventor for the said invention is, 

Arvind Thiagarajan 

H 24/6, Vaigai Street, Besant Nagar 
Chennai 600090. Nationality - Indian 



We claim priority from the application(s) filed in convention countries, 
particulars of which are as follows:- 

Not applicable 



We state that the said invention is an improvement in or modification of the 
invention, the particulars of which are as follows and of which we are the 
applicant/patentee: 



Not applicable 



We state that the application is divided out of our application, the particulars 
of which are given below and pray that this application be deemed to have been 
filed on under section 16 of the Act. 
That we are the assignee of the true and first inventor. 



Not applicable 



7. That our address for service in India is as follows: 

Matrix View Technologies (India) Private Limited 

No.69, Mahalakshmi Koil Street 
Kalakshetra Colony, Besant Nagar 
Chennai 600090. TAMILNADU. INDIA. 



8. Following declaration was given by the inventor or applicant in the 

convention country declare that the applicant herein is our assignee or legal 
representative 

Not applicable 



9. That to the best of my knowledge, information and belief the facts and 
matters stated herein are correct and that there is no lawful ground of 
objection to the grant of patent to us on this application. 



Mr. Arvind Thiagarajan 
(Inventor) 

10. Following are the attachments with the application: 

a) Provisional specification (3 copies) 

b) Fee of Rs. 



I request that a patent may be granted to us for the said invention 
Dated at Chennai on this 11th day of December, 2003 

Mr. Anand Thyagarajan 
(Authorized Signatory) 



To 

The Controller of Patents 
The Patent Office 
At Chennai 
Section 10 



"NOVEL ALGORITHM FOR LOSSLESS DATA COMPRESSION" 



Applicant: 

ARVIND THIAGARAJAN 
H 24/6, Vaigai Street 
Besant Nagar, Chennai 600090 
TAMILNADU. INDIA. 



The following Provisional Specification describes the nature of the invention and the 
manner in which it is to be performed. 



Field of Invention 

The present invention relates to the compression of image or other highly 
correlated data streams. 

Background of Invention 

The role of data and image compression assumes significant importance as 
the world makes a paradigm shift from analog to digital systems. Data 
compression, which was impossible due to the inherent disadvantages of analog 
systems, has become a feasible reality with digital systems. Computational 
overhead and complexity posed the most serious threat to the development 
of data compression. With the advent of high-speed digital processors with MIPS 
capability, most of these problems have been overcome. 

Image compression has many practical applications, which are driven by the 
fact that image data is a highly correlated data stream. Image compression can be 
either lossy or lossless depending on the criticality and nature of the application. 
The human eye is more sensitive to changes in luminance than to changes in color. 
Hence for applications that are not critical in nature, i.e. in cases where the quality 
of the compressed data is not an important factor for further processing, lossy 
compression can be employed. The portions of the image data that do not produce 
a perceptible visual difference are removed, resulting in excellent compression 
ratios. There are applications where image distortion is totally unacceptable and which 
require lossless compression, where no portion of the image can be removed no 
matter how inconsequential the data is, resulting in very low compression ratios. An 
ideal solution to this problem would be a lossless compression technique that 
produces significant compression ratios, which is exactly the motivation behind this 
novel and unique invention. 

Data Compression Principles 

All data compression techniques are based on a fundamental principle of 
Shannon's information theory, which says that there is a lower limit to the average 
number of bits required to code a symbol, called the entropy, given by 

$$H = -\sum_i p_i \log_2 p_i$$

where $p_i$ is the probability of occurrence of the i-th symbol. The implication of this 
equation is that if a symbol occurs many times, i.e. its frequency of occurrence is 
high, then this symbol contributes to redundancy and is hence given lower priority 
when compared to a symbol whose frequency of occurrence is much lower. This 
forms the basis for all entropy coding or source coding schemes. The idea is to 
give a shorter codeword to more probable events, i.e. the more frequently a 
symbol occurs, the shorter its codeword is. Image data follows a Laplacian 
distribution, which means that the occurrence of each symbol is equiprobable. 
Hence all the symbols require almost the same number of bits, resulting in very low 
compression ratios. To achieve high compression ratios we should transform the 
image data stream so that the even probability distribution in the 
original image becomes a probability distribution in which a few symbols 
have a high frequency of occurrence and the other symbols a relatively low frequency 
of occurrence, resulting in a significant reduction in the bits per symbol, thereby 
enhancing the compression ratios. 
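
As a rough illustration of the point above, the following Python sketch evaluates the entropy formula for two hypothetical four-symbol sources, one with equiprobable symbols and one in which a single symbol dominates. The probability values are assumptions chosen for the example and are not measurements taken from image data.

```python
import math

def entropy_bits(probabilities):
    """Shannon entropy H = -sum(p * log2(p)) in bits per symbol."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Hypothetical distributions, for illustration only.
uniform = [0.25, 0.25, 0.25, 0.25]   # every symbol equally likely
skewed = [0.85, 0.05, 0.05, 0.05]    # one symbol dominates

h_uniform = entropy_bits(uniform)
h_skewed = entropy_bits(skewed)
print(h_uniform, h_skewed)
```

The equiprobable source needs two bits per symbol, while the skewed source needs less than one bit per symbol on average, which is the kind of reduction a suitable transformation aims for.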

Some of the popular entropy encoders are the Run Length Encoder, Huffman, 
Shannon-Fano, Lempel-Ziv, the Arithmetic Encoder etc. All of these encoding techniques, 
with the exception of the arithmetic encoder, allot a minimum of one bit per 
symbol. The arithmetic encoder, whose algorithm generates a single real number 
for a given sequence of symbols, can theoretically achieve bit rates of less than one 
bit per symbol. 
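
To make the one-bit floor concrete for one of the coders named above, here is a minimal Python sketch that builds Huffman code lengths for a short symbol stream. The function name `huffman_code_lengths` and the sample string are assumptions made for this illustration only.

```python
import heapq
from collections import Counter

def huffman_code_lengths(data):
    """Return {symbol: code length in bits} for a Huffman code built from data."""
    counts = Counter(data)
    if len(counts) == 1:
        # A lone symbol still needs one bit in a prefix code.
        return {symbol: 1 for symbol in counts}
    # Heap entries: (total count, unique tie-breaker, {symbol: depth so far}).
    heap = [(count, index, {symbol: 0})
            for index, (symbol, count) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)
    while len(heap) > 1:
        count_a, _, depths_a = heapq.heappop(heap)
        count_b, _, depths_b = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        merged = {s: d + 1 for s, d in {**depths_a, **depths_b}.items()}
        heapq.heappush(heap, (count_a + count_b, next_id, merged))
        next_id += 1
    return heap[0][2]

sample = "aaaaaaaabbbccd"  # hypothetical symbol stream
lengths = huffman_code_lengths(sample)
print(lengths)
```

The most frequent symbol receives a one-bit code and no symbol receives fewer, so a Huffman coder cannot drop below one bit per symbol however skewed the source is.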

Current Image Compression Technologies 

Image compression technologies can be broadly classified as either lossy or 
lossless, depending on whether the subsequent decompression of the compressed data 
produces an exact pixel-to-pixel replica of the original data or not. 

We can logically infer from the section on Data Compression Principles that 
any efficient compression technique requires a transformation, also known as 
pre-coding, which in turn aids in increasing the efficiency of the second step, the 
entropy coder. At this stage it must be emphasized that if the entropy coder is to 
produce good compression ratios, then the pre-coding should transform the data 
into a form suitable for the entropy coder. If the transformation is not efficient 
enough, then the entropy coder is rendered ineffective. Hence it can be logically 
concluded that the pre-coding or the transformation is the most important stage of 
any image compression algorithm. 



The most popular pre-coding transformation used in image compression is 
the Discrete Cosine Transform (DCT). This transformation gives the frequency and 
extent of data change inside the image. Another important property of any 
transformation is that it should be reversible too, so that the reverse process can 
be applied at the decompression stage to obtain the original image. This 
transformation is extensively used in the JPEG algorithms and its variants. 

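As indicated above, the DCT is a reversible transform whose forward transform for an 
N x N block is given as 

$$\mathrm{DCT}(i, j) = \frac{2}{N}\, C(i)\, C(j) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x, y) \cos\left[\frac{(2x+1)\, i\pi}{2N}\right] \cos\left[\frac{(2y+1)\, j\pi}{2N}\right]$$

where $C(x) = \frac{1}{\sqrt{2}}$ if $x = 0$, else $1$ if $x > 0$. 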

The above-mentioned technique poses the following problems. 

One is the complexity of the equation in terms of the number of multiplications and 
additions. The most straightforward way to implement the DCT is to use the 
defining equation. In the 2D case, with arrays of dimensions N x N, the number of 
multiplications is on the order of $2N^3$ using a separable approach of computing 1D 
row and column DCTs. Specifically, for the 8 x 8 pixel array used in the 
JPEG family, this amounts to 1024 multiplications and 896 additions. In spite of the 
tremendous improvements made in reducing the number of computations, 
the reduction has not been significant enough to remove the overhead 
it places on the hardware that implements the algorithm. 
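
The figures quoted for the 8 x 8 case are consistent with a direct row-column computation. The short Python sketch below counts the operations under the assumption that every 1D N-point DCT is evaluated straight from its definition, with N multiplications and N - 1 additions per output. The function name `separable_dct_cost` is hypothetical.

```python
def separable_dct_cost(n):
    """Operation count for a direct row-column DCT on an n x n block.

    Assumes each 1-D n-point DCT is evaluated straight from its definition:
    n outputs, each needing n multiplications and n - 1 additions.
    """
    transforms = 2 * n                     # n row passes plus n column passes
    multiplications = transforms * n * n
    additions = transforms * n * (n - 1)
    return multiplications, additions

mults, adds = separable_dct_cost(8)
print(mults, adds)
```

For N = 8 this gives 1024 multiplications and 896 additions per block, matching the numbers stated above.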



Even though the image data consists of integers, multiplying them by the cosine terms 
in the formula produces fractional or real numbers, because cosine values 
are fractional in nature except at particular arguments such as integer multiples of Pi, 
which is generally not the case. Since such numbers cannot be stored exactly with finite 
precision, they can produce rounding errors in the reverse process, resulting in losses, 
which means that the process is no longer pixel-to-pixel lossless. 
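
The effect can be seen on a small example. The following Python sketch applies a one-dimensional orthonormal DCT, used here as a simplification of the 2D transform, to a row of integer pixel values and measures how far each coefficient lies from the nearest integer. The pixel values and the function name `dct_1d` are assumptions made for illustration.

```python
import math

def dct_1d(samples):
    """Orthonormal DCT-II of a list of numbers, straight from the definition."""
    n = len(samples)
    coefficients = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        total = sum(value * math.cos((2 * x + 1) * k * math.pi / (2 * n))
                    for x, value in enumerate(samples))
        coefficients.append(scale * total)
    return coefficients

pixels = [52, 55, 61, 66, 70, 61, 64, 73]   # hypothetical integer pixel row
coefficients = dct_1d(pixels)
# Distance of every coefficient from the nearest integer.
fractional_parts = [abs(c - round(c)) for c in coefficients]
print(coefficients)
```

The inputs are integers but the coefficients are not, so storing them as integers requires rounding, and that rounding is where the exact pixel values can be lost.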

Another popular transformation is the wavelet transform, which is 
used in the latest image compression techniques like JPEG2000. This uses a mother 
wavelet to decompose the image data into frequency sub-bands, which in turn 
increases the redundancy in most of the sub-bands, hence improving compression 
ratios. Used in their original form, the mother wavelets do not give an 
integer-to-integer transformation, but when used after a process called lifting they become 
integer-to-integer transforms, thereby making the entire process lossless. 
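
A minimal Python sketch of the lifting idea is given below, using the integer Haar step (often called the S-transform) rather than the particular wavelet filters of JPEG2000. The function names and the sample pixel pairs are assumptions made for illustration.

```python
def s_transform_forward(a, b):
    """Integer Haar (S-transform) step: returns (approximation, detail)."""
    detail = a - b
    approx = b + (detail // 2)     # floor division keeps everything integer
    return approx, detail

def s_transform_inverse(approx, detail):
    """Exactly undo s_transform_forward by reusing the same floor division."""
    b = approx - (detail // 2)
    a = detail + b
    return a, b

pairs = [(5, 2), (2, 5), (200, 201), (0, 255)]   # hypothetical pixel pairs
round_trip = [s_transform_inverse(*s_transform_forward(a, b)) for a, b in pairs]
print(round_trip)
```

Because the inverse subtracts exactly the rounded quantity that the forward step added, every pair is recovered bit for bit, with no floating-point values involved.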

Color transformations also offer an interesting prospect for compression. 
The commonly used color space is RGB, where every pixel is quantized by using a 
combination of Red, Green and Blue (primary color) values. This format is ideally 
suited for designers but not so ideal for a compression algorithm. As indicated earlier, the 
human eye is more sensitive to luminance than color; hence the Chrominance 
Luminance and Value format offers an interesting perspective for compression. 

Description 

Image data is highly correlated, i.e. adjacent pixels are closely related. Hence 
it is possible to create significant redundancy, which is then followed by a unique 
combination of existing data transforms and source encoders to achieve higher 
compression ratios. 

Repetition Coded Compression provides a unique solution wherein we can 
achieve higher compression ratios without having to compromise on quality. 
This essentially means that Repetition Coded Compression can achieve very high 
compression ratios while maintaining the pixel-to-pixel integrity of the image data during 
the compression and decompression process. Repetition Coded Compression is an 
algorithm that exploits the close correlation between adjacent pixels. 

Repetition Coded Compression divides the pre-coding block of the 
compression process into two logical stages, the transformation stage and the data 
re-arrangement stage. The transformed and re-arranged data is passed as input 
to the source coder, which comprises an arithmetic coder preceded by a run 
length encoder. The transformation in Repetition Coded Compression primarily has four 
variants: 

• Repetition Coded Compression Horizontal 

• Repetition Coded Compression Vertical 

• Repetition Coded Compression Predict 

• Repetition Coded Compression Multidimensional 

Repetition Coded Compression Horizontal, Repetition Coded Compression 
Vertical and Repetition Coded Compression Predict can be classified in the 1-D 
Repetition Coded Compression category, and Repetition Coded Compression 
Multidimensional can be classified in the 2-D Repetition Coded Compression category. 
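
This specification does not set out the details of the four transformations. Purely as an illustration of how a horizontal, repetition-based, integer-to-integer transform could exploit the correlation between adjacent pixels, the Python sketch below marks each pixel that repeats its left-hand neighbour in a bit-plane and stores only the non-repeating values. The function names and the exact rule are assumptions made for this example and should not be read as the claimed method.

```python
def horizontal_repetition_encode(row):
    """Split a pixel row into a repetition bit-plane and the non-repeated values."""
    bit_plane, values = [], []
    previous = None
    for pixel in row:
        if previous is not None and pixel == previous:
            bit_plane.append(1)        # repeats the left-hand neighbour
        else:
            bit_plane.append(0)        # new value, must be stored
            values.append(pixel)
        previous = pixel
    return bit_plane, values

def horizontal_repetition_decode(bit_plane, values):
    """Rebuild the row: copy the previous pixel on 1, read a stored value on 0."""
    row, stored = [], iter(values)
    for bit in bit_plane:
        row.append(row[-1] if bit else next(stored))
    return row

row = [12, 12, 12, 13, 13, 40, 40, 40, 40]   # hypothetical pixel row
bit_plane, values = horizontal_repetition_encode(row)
restored = horizontal_repetition_decode(bit_plane, values)
print(bit_plane, values)
```

In this sketch only comparisons and copies are used, so the round trip is exact, and the nine-pixel row reduces to three stored values plus a highly repetitive bit-plane that a run length or entropy coder can compress further.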



The data re-arranging stage of Repetition Coded Compression comprises the 
following steps: 

• Reversible-Sort process 

• Last to First re-arrangement 

Applications of the Present Invention 

Repetition Coded Compression can be used in a wide gamut of applications 
ranging from Medical Imaging to Digital Entertainment to Document management. 
Each of these verticals requires Repetition Coded Compression to be implemented 
in its own unique way to deliver a robust and powerful end product. 

Repetition Coded Compression could be deployed in the following forms for 
commercialization. 

1. Chip - (ASIC, FPGA etc.) 

2. DSP, Embedded Systems 

3. Standalone Hardware boxes 

4. Licensable Software (as DLLs, OCX etc.) 

5. Software deliverables 

Thus, the above-mentioned account describes the invention in detail. It is 
intended that the foregoing description be only illustrative of the present invention, 
and it is not intended that this unique invention be limited or restricted thereto. 
Many specific embodiments of this novel invention will be apparent to one 
skilled in the art from the foregoing disclosure. The scope of the invention should 
be determined not only with reference to the above description but also with reference 
to all other additions, substitutions and modifications of the present invention that do 
not depart from the spirit of this invention. 



Abstract 



This invention is a process for compressing highly correlated image data in 
an absolutely lossless manner (i.e. pixel-to-pixel lossless with zero Mean 
Square Error, M.S.E.). The system for compressing image and other highly 
correlated data comprises means for reshaping the data, means for encoding 
the repetitions and means for storing the compressed data. 

The process of reshaping the image data includes a lossless transformation 
followed by data re-arrangement. The lossless transformation is performed on 
an image data set of pixels, which is transformed into bit-planes and data 
values using one of the four data transformation algorithms: Repetition Coded 
Compression Horizontal, Vertical, Predict and Multidimensional. Repetition 
Coded Compression is an integer-to-integer transformation that 
converts the integer value of a pixel into another set of integer values to create 
redundancy. This integer-to-integer transformation is absolutely lossless, as 
there is no loss of pixel data, unlike algorithms such as JPEG that utilize integer 
to floating-point transformations. A floating-point number cannot always be 
accurately stored, and hence there is a loss of data. 

Repetition Coded Compression uses simple logical operations to increase 
the redundancy in the image. As there is no multiplication or division process in 
Repetition Coded Compression, the image attributes are preserved without any 
loss. Thus the simple transformation works at increasing the redundancy. 

The next process is the re-arrangement of the transformed pixels. This 
process further increases redundancy by sorting and rearranging the data in a 
suitable manner. 
The redundancy thus created is then passed on to an entropy coder that 
allocates specific codes to the data. The entropy coding process gives shorter codes to 
the more frequently occurring symbols, i.e. the more frequently the symbol 
occurs, the shorter the code. Huffman or arithmetic coding effectively 
compresses the redundancy created by Repetition Coded Compression. The 
encoding maintains the lossless property of the image while at the same time 
producing very good compression ratios. 



